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ABSTRACT 

We present a Monte Carlo model for the structure of low mass (total mass < 25M0) planetary 
systems that form by the in situ gravitational assembly of planetary embryos into final planets. Our 
fT^ \ model includes distributions of mass, eccentricity, inclination and period spacing that are based on 

the simulation of a disk of 2OM0, forming planets around a solar mass star, and assuming a power 
' law surface density distribution oc a^^-^ . 

\ The output of the Monte Carlo model is then subjected to the selection effects that mimic the 

. observations of a transiting planet search such as that performed by the Kepler satellite. The resulting 

' comparison of the output to the properties of the observed sample yields an encouraging agreement 

in terms of the relative frequencies of multiple planet systems and the distribution of the mutual 
, inclinations, when moderate tidal circularisation is taken into account. The broad features of the 

• period distribution and radius distribution can also be matched within this framework, although the 

model undcrprcdicts the distribution of small period ratios. This likely indicates that some dissipation 
is still required in the formation process. 
Qh ' The most striking deviation between model and observations is in the ratio of single to multiple 

' systems, in that there are roughly 50% more single planet candidates observed than are produced in 

any model population. This suggests that some systems must suffer additional attrition to reduce the 
number of planets or increase the range of inclinations. 

Subject headings: planet-star interactions; planets and satellites: dynamical evolution and stability 



Oh! 



O 

52 ! 1. INTRODUCTION 

The search for planets around stars other than the Sun has revealed a great diversity of both planet mass and 
location. Jovian mass planets have been discovered both in very short period orbits (Mayor & Queloz 1995, Butler et 
''TT al- 1997), at distances more characteristic of the planets in our own solar system (Butler & Marcy 1996; Boisse et al. 
^ 2012), at great distances from their host stars (Udalski et al. 2005; Gaudi et al. 2008; Kalas et al. 2008; Marois et al. 

2008, 2010) and potentially even as free floating objects (Sumi et al. 2011). Lower mass planets are harder to detect, 
' but planets in the Uranus/Neptune mass range (lOM^-lOOM^) have also been found with a range of semimajor axes, 
I both close in (Howard et al. 2010; Mayor et al. 2011; Borucki et al. 2010; Holman et al. 2010; Lissauer et al. 2011a) 
• . and at large separations (Gould et al. 2006, 2010; Sumi et al. 2010). Furthermore, multiple detection techniques are 
^ • now able to probe below lOAf© (Wolszczan & Frail 1992; McArthur et al. 2004; Santos et al. 2004; Rivera et al. 2005; 
O ■ Beaulieu et al. 2006; Lovis et al. 2006; Charbonneau et al. 2009; Queloz et al. 2009; Mayor et al. 2009; Batalha 
' et al. 2011), potentially delving into the regime of true terrestrial planet formation. Indeed, the number of planet 
[ candidates emerging from the Kepler satellite data suggests that this class may soon dominate the demographics of 
J> ' known extrasolar planets (Borucki et al. 2011). 

The origin of these different classes of planet is still a matter of active discussion. The presence of so-called hot 
Jupiters close to their parent stars was initially held to be securely described within the context of planetary migration 
(Lin, Bodenheimer & Richardson 1996), in which giant planets form at distances ~ several AU, and migrate inwards 
- -' due to torques from the gaseous protoplanetary disk (Goldrcich & Tremaine 1980; Ward 1997). The discovery of 
significant misalignments between the planetary orbital angular momenta and the stellar star host spin (Winn et al. 
2010 and historical references therein) has cast some doubt on the applicability of this model to at least some of the 
observed systems, reviving interest in such mechanisms as planetary scattering (Rasio & Ford 1996; Nagasawa & Ida 
2011) or Kozai migration (Wu & Murray 2003; Fabrycky & Tremaine 2007; Naoz et al. 2011). Another significant 
problem for the migration paradigm is the fact that both radial velocity and transit methods find Neptune mass planets 
in a part of parameter space that such models predicted to be devoid of planets (Ida & Lin 2008). This does not 
necessarily imply that migration does not occur, but it does indicate that current models are either not universally 
applicable or are missing one or more crucial elements. 

Such advances have spurred new interest in the origin of low mass planets around other stars, and generated 
alternative proposals. It is possible that such planets did migrate inwards but were subject to additional perturbations, 
either from mutual gravitational interactions (Terquem & Papaloizou 2007; Ida & Lin 2010) or some other, stochastic, 
forcing (Rein 2012), in order to match the observed period ratio distribution in multiple systems. Alternatively, it has 
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been proposed that intense stellar irradiation can evaporate the envelopes of hot Jupiter planets, leaving behind only 
the Neptune- mass stellar core (Baraffe et al. 2004). However, current estimates of the evaporation rates for known 
Jupiter planets (Ehrenreich & Desert 2011) suggest that these do not compose a viable progenitor population. 

Instead, Hansen & Murray (2011) propose that this population is the result of in situ assembly of rocky planets, 
which may capture substantial, but not dominant, gaseous envelopes if they grow to sufficient size before the gaseous 
disk disperses. This proposal has the advantage that it naturally produces multiple planet systems, as are observed, 
and furthermore matches the statistical properties of the observed multiple planet systems. 

The original intent of the Hansen & Murray (2011) proposal was to match the planets observed in the radial velocity 
surveys descibed in Howard et al. (2010) and Mayor et al. (2011), which represent the tip of the iceberg of the low 
mass planet sample, by virtue of the detection limits inherent in the RV method. However, the Kepler mission (Borucki 
et al. 2011) has begun to expose more of the iceberg, suggesting that there are many systems with lower mass planets, 
offering the possibility of further testing the in situ assembly model. In particular, these lower mass planets most 
likely did not substantially increase their masses by accreting gas from the nebula and so should represent a more 
straightforward test of the model, without uncertainties in the treatment of gas accretion. Therefore, in this paper, 
we perform a quantitative comparison of in situ assembly models with a planetary candidate sample culled from the 
Kepler satellite data, under the assumption that these are primarily rocky planets and that their organization reflects 
the conditions that emerge directly from the gravitational assembly of the planetary system. 

In § [2] we describe our model for the original rocky protoplanetary disk, and its assembly into the final planetary 
configuration. We also develop statistical descriptions for the distribution of various important quantities, such as 
planetary masses and mutual inclinations which are then used to simulate the distribution of a large population of 
planetary systems in a Monte Carlo fashion, and to quantify their observability when studied in the manner of a search 
for transits. These are then compared to a well-defined subset of the Kepler candidates in § [3] and the implications 
are discussed in § El 

2. PLANETARY ASSEMBLY SIMULATIONS 

Hansen & Murray (2011), hereafter HMll, simulated the assembly of planets for a variety of rocky disks, with 
masses ranging up ^ lOOM^ within 1 AU, and accounting for both purely rocky systems and for the accretion of 
substantial amounts of gas from the surrounding nebula. This was dictated by a desire to match systems detected in 
radial velocity surveys, whose masses suggest rock-to-gas ratios of order unity (and whose existence is born out by a 
handful of transiting systems, such as HD149026b, HAT-P-11 and Kepler-9b,c). However, statistical analyses of the 
Kepler sample (e.g. Howard et al. 2012, Youdin 2011) suggest that this population contains a substantial contribution 
from bodies whose mass is predominantly rocky, even if their radii are inflated by a minority component (by mass) of 
gas. Therefore, this new sample, by virtue of its size and different detection technique, offers a new test of planetary 
origin models. 

Our model for the origin of these planetary systems is that the final assembly stage from planetary embryos to 
final planets occurred in situ, with the masses and separations of the final planets determined by the gravitational 
interactions between the assembling bodies. Whether the planetary embryos are formed from rocky material that 
condensed out of the gas in situ or migrated inwards as small bodies is not specified, and may be a function of total 
disk mass. In HMll, the disk masses assumed were in excess of SOM^ and most likely did require an enhancement 
over the heavy element inventory of the original local gas disk. However, for lower mass disks, such inward migration 
may not be necessary. 

Our model for the origin of the observed Kepler systems has a surface density profile S ~ a~^'^, normalised to 
contain 2OM0 of rocky material interior to 1 AU, around a IMq star. This is chosen to be a lower mass version of 
the density profile that best fit the observations in HMll, and the overall mass normalisation was chosen to be the 
maximum that would not result in many of the assembling protoplanets crossing the gas capture threshold defined 
in HMll within 1 Myr. Interestingly, these criteria yield almost exactly the same disk as has been recently derived 
by Chiang & Laughlin (2012), using an analogue of the Minimum Mass Solar Nebula argument, but applied to the 
observed Kepler candidate sample. This agreement suggests an encouraging self-consistency to our adoption of this as 
the underlying basis for forming a Kepler sample dominated by rocky planets. 

Our initial conditions are chosen using the formalism of Kokubo & Ida (1998) to assign initial masses and semi-major 
axes for the protoplanetary embryos at the end of the oligarchic accumulation stage, and are integrated forwards 
using the Mercury integrator (Chambers 1999) on the UCLA Hoffman2 Shared Computing Cluster. We perform 
100 realisations of this model. The inner edge of the original disk is taken to be at 0.05 AU, and so the time step 
for the integrations is only 12 hours, in order to resolve the innermost orbits. The integrations are run for 10 Myr. 
This proves sufficient to achieve dynamical stability, as evidenced both by inspection of the individual evolutionary 
histories and the integration of a subsample of 20 to ages of 100 Myr, without further significant change in the overall 
configuration. This assembly time scale is shorter than has been found for the solar system terrestrial planets (e.g. 
Chambers & Wetherill 1998; Chambers 2001) because of the higher mass and more compact nature of the disk. 

We then examine the ensemble of surviving systems to characterise various properties of the resulting population. 
The parameters of the final systems are shown in Table [l] including a variety of statistical measures collected together 
by Chambers (2001) for the purposes of characterising simulations of late stage planet assembly in the Solar System. 
Not surprisingly, the simulations here are both more closely packed (because of the larger mass in the disks relative 
to the solar system case studied by Chambers) and have the mass distributed over a greater range of radii (because 
we consider disks extending down to 0.05 AU). Furthermore, many of these measures are based on the distribution 
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Fig. 1. — The solid histogram shows the distribution of planetary multiplicity for one hundred realisations of our scenario, after 10 Myr 
(counting planets for which a < I.IAU). The dotted histogram is the best fit Poisson distribution for the number of planets in excess of 
two, but it is not sufficiently narrowly peaked (x^ = 3.9 per degree of freedom for bins A'p=3-8). The dashed histogram is the squared 
Poisson distribution discussed in the text, which is a much better fit (x^ = 1-1 per degree of freedom). 

Fig. 2. — The solid histogram is the distribution of final planet masses, normalised by the local value of cta/, for planets with a < 1.1 AU. 
The dotted line is a Rayleigh distribution, in the scaled variable M/a^. 



of mass, a quantity only indirectly accessible for the bulk of the Kepler sample. As such, we will use our simulation 
results to build a Monte Carlo model to perform a more direct comparison with the observables of a Kepler-style 
experiment. 



2.1. Multiplicity 

One of the most fundamental questions one can ask about the Kepler data set is how many planets a given system 
is likely to hold. Kepler has detected 361 stars (Fabrycky et al. 2012) which exhibit multiple transiting planets, with 
the record for the largest number of transiting planets around a single star currently held by Kepler- 11 (Lissauer et al. 
2011a) at six. Of course, the mutual inclinations of a set of planets orbiting a star will dictate what fraction transit, 
and therefore the transiting planets represent only a subsample of the true underlying system. 

Figure [1] shows the distribution of true multiplicity for our simulated ensemble, in which we count all planets that 
survive with semi- major axes < 1.1 AU, (this excludes a handful of surviving embryos on eccentric orbits, scattered 
out of the original disk), irrespective of mass. The median number of surviving planets, so defined, is four. 

Statistical analyses of Kepler data with pre-defined functions often assume a functional form for the multiplicity that 
is based on the Poisson distribution, (e.g. Fang & Margot 2012). We find that this is not a good fit to our simulations 
because it is too broad (i.e. it contains too many systems with either fewer or more planets than the median value). 
Furthermore, the minimum number of planets found in our simulations is three, so that our function should be adopted 
to describe the number in excess of two. To that end, we find that a better empirical model of the true distribution is 
to square the Poisson function for the number of planets in excess of two, i.e. 

\2(W-2) -2A 

p{N) = (1) 

((iV-2)!)' 

where A — 3.8 yields the best fit. 

The reason for the minimum multiplicity is related to the requirements of conserving mass, energy and angular 
momentum while assembling an extended distribution of mass into a handful of planets through mutual gravitational 
perturbations. We examine this issue further in § 12.3.11 



2.2. Mass and Radius Distributions 
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Fig. 3. — The solid points are masses and radii for extrasolar planets measured through transits and radial velocities, based on the known 
planet sample as of Dec 2012. The open points represent the solar system planets Venus, Earth, Uranus and Neptune. The solid and 
dotted curves are theoretical mass-radius relations from Seager et al. (2007), for Perovskite and Water respectively. The long dashed line 
is the empirical mass-radius relationship from Lissauer et al. (2011b), and the short dashed line is the relationship derived by Tremaine & 
Dong (2012). Our radius estimates are based on the solid curve, with a potential multiplicative factor R' applied, to account for potential 
Hydrogen envelopes. 

Fig. 4. — The solid histogram shows the distribution of A for the planet pairs with a < 1.1 AU. The vertical dotted line is the stability 
limit for Jovian mass pairs derived by Gladman (1993). The vertical dashed line is for A = 9 derived by Smith & Lissauer (2009) for 
earth-mass systems. We see that our distribution cuts off completely at this value, although the more accurate effective cutoff is A = 13, 
and the bulk of the pairs are more widely separated than this, peaking around A = 30, with few systems with A > 50. 



The masses of surviving planets exhibit a trend of increasing mass with semi-major axis. Thus, we fit the mass 
distribution as a Rayleigh distribution with a dispersion that depends on semi-major axis, namely 

/M = 4e-^("/-)^ (2) 

and 

Ura = 7Me(a/lAt/)°-^ (3) 

This resulting fit is shown in Figure [2l 

For comparison to the Kepler sample, we need to convert masses to radii. In the simplest incarnation, our model 
assumes that the masses of the planets are dominated by the rocky component. Therefore, we use the mass-radius 
relation of Seager et al. (2007), based on the Perovskite equation of state, to characterise our planetary radii. Figure [3] 
shows the current sample of planets with both mass and radius measurements. It is clear that many low mass planets 
have radii in excess of that expected from our no-gas model, suggesting that the observed Kepler planet possess 
atmospheres of lower molecular weight material, either water or gas. 

Previous authors have attempted to address this issue by interpolating between rocky planets and more massive, 
gaseous planets. Lissauer et al. (2011b) suggest a mass radius relation M = Mq{R/ Rq)'^-^^ , based on an interpolation 
between the solar system terrestrial planets and ice giants. Tremaine & Dong (2012), on the other hand, interpolate 
across the mass range probed by transiting planets. Wu & Lithwick (2012a) find an empirical mass-radius relation 
M — 3M^{R/ R^) by placing mass constraints on planetary pairs using transit-timing variations. In each case, these 
relations amount to an implicit, but not well characterized, assumption about the nature of the planetary atmosphere. 

We adopt a radius based on the Seager et al. relationship, but allow for a multiplicative radius enhancement factor 
R' , to account for a potential gaseous atmosphere. This is a reasonable approximation for the radius of a planet with 
a rocky core and hydrogen atmosphere of constant mass fraction, at least in the regime (gas mass fractions ~ 15% or 
less) necessary for the known exoplanets with masses < lOM^. A pure water planet is also well described in this way, 
with an enhancement factor R' = 1.3. 



2.3. Spacing of Planets 
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The spacing of planets in multiple systems potentially contains vital information about how the system was assem- 
bled. In HMll we examined a variety of statistics commonly used in assessing models of solar system formation, and 
found that the in situ assembly model was able to reproduce the properties of the systems observed by Howard et al 
and Mayor et al. Table [1] shows the same quantities for these simulations. 

One of the most common measures for planetary spacing is A, the orbital spacing between two planets measured in 
terms of their average Hill radius 

/ I \ -1/3 

^ _ - ai / mi + 7712 ^ 



a2 + fli V 3Af* 

where ai,mi and 02,^2 are the semi- major axes and masses of the inner and outer planets, and M* is the mass of 
the host star. Figure 2] shows the ensemble distribution of A in our simulations. The distribution is broad, peaking 
at A ~ 30. Various authors have evaluated the lower limit on A imposed by dynamical stability, with A > 9 being 
the expectation for earth- mass planets on initially circular orbits (Smith & Lissauer 2009). Our simulations do show a 
handful of systems just above this limit, but suggest that the more significant edge to the distribution lies at A ~ 13. 
At the other end, we find that A > 50 is rare, suggesting that transiting planet pairs with larger values likely host 
additional, as yet undetected, planets. 

The statistical measures described in HMll, and also A, are based on knowledge of the planetary mass. For most 
Kepler candidates, however, we have only radius measurements. Thus, a more appropriate measure for the compactness 
of these systems is the distribution of period ratios in neighbouring pairs. Lissauer et al. (2011b) and Fabrycky et 
al. (2012) have discussed this quantity and demonstrated that the Kepler sample shows a broad distribution with 
additional structure near commensurabilities. Figure [5] shows the distribution from our simulations, peaking at ratios 
just about two. There is some hint of structure near the 2:1 commensurability, but with neither the strength or the 
asymmetry seen in the observational sample. 

Lissauer et al. (2011b) also define the d statistic, to measure the proximity to commensurabilities of order i. Figure[6] 
shows the distributions of (i and (2 for our simulations. We see that there is some measure of avoidance of first order 
resonance in the simulations, and no corresponding feature with respect to second order resonance. However, we again 
see a symmetric structure rather than the asymmetry observed in Lissauer et al or Fabrycky et al. Nevertheless, the 
broader features of the distribution are close enough that they may provide the initial basis on top of which additional 
dynamical structure is created by dissipative effects such as tidal evolution, as proposed by Wu & Lithwick (2012b) 
or Batygin & Morbidelh (2013). 

We can cast the same information in the direct form of a distribution of separations, which we call S (since it is 
essentially the un- normalised version of A), shown in Figure [8] In this case, we also want to account for a slight trend 
of d with semi-major axis, fitting the distribution about this. The origin of the trend is the fact that annular spacing 
is dependant on the amount of mass in the disk, as discussed in ? 12.3.11 Thus, we normalise S to Sq, where 

(5o = 0.43- 0.23 logio(^) (4) 

and model the resulting distribution of e = 6/6q as 

/(e) OC e-^((^"0.82)/0.07)^ _^ q gg-i((e-^1.28)/0.12f _^ o.48e" 5 ((^-0.53)/0. 1)= ^ Q gg^_i((e_0.98)/0.05)=_ (^5^ 

This is the function we will use to determine the spacing of our planetary systems. However, since the mass and 
separations are chosen seperately, we will include an additional step of removing systems from the Monte Carlo model 
that violate the requirements 13 < A < 50. 

2.3.1. An approximate theory of planet spacmgs 

It is of interest to understand how the spacing and multiplicity of these systems is determined. The underlying 
process is the gravitational assembly of an extended mass distribution, regulated by both the requirements of global 
mass, energy and angular momentum conservation, as well as the requirement that the assembled bodies are sufficiently 
massive to perturb and eventually accrete any material nearby. 

The global conservation requirements are simple. Although gravitational perturbations can potentially eject material 
from a planetary system, the systems we simulate are sufficiently deep in their stellar potential wells that the amount 
of material lost via this mechanism is negligible, and similarly for accretion onto the star. Thus, for a disk with surface 
density E — So(^/^o) spread from ri„ to rout, the total mass, energy and angular momentum budgets are 

Affot = 27r2jo (O) 

2 — a 

1 — a 

5/2 — a 

(9) 
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Fig. 5. — The observed distribution of period ratios between neighbouring pairs (in which both he within a < I.IAU). The dotted lines 

indicate the positions of the 2:1 and 3:2 orbital commcnsurabilities. 

Fig. 6. — The upper panel shows period ratios now binned in terms of (^i, the proximity to the nearest first order commensurability. 
Exact resonance is at C^i = (shown by the vertical dashed line). The surviving systems appear to avoid orbits too close to resonance. The 
lower panel shows ^2 (proximity to a second order resonance) for those systems for which |(^i| > 1. We see little indication of structure in 
this instance. 



If this were to be incorporated into a single final planet, of mass Mtot, then energy conservation dictates that the 
final semi-major axis is 

1 - a ((w/ro)2-" - (r,„/ro)2-") 



2- a {{rout/roV-" - (n„/ro)i-") 



and angular momentum conservation requires that the eccentricity be given by 

{2-af (l-t/i-«)(y5/2-«_i)2 



= 



(a - l)(5/2-a)2 



,2-a 



1)3 



(10) 



(11) 



where y — rout/rin- It proves impossible to satisfy these two conditions unless y = 1 (i.e. unless wo specify an infinitely 
narrow ring, i.e. a single planet), for a > 0. This is because each annulus of the disk is assumed to be on a circular 
orbit and therefore carries the maximum angular momentum per unit energy. The energy integral is a harmonic 
average, which shifts the final location for the putative single planet inside the median radius of the disk, resulting 
in an excess of angular momentum per unit energy if one assumes all the mass is accumulated into a single planet. 
Thus, wc require at least two planets to conserve all of the mass, energy and angular momentum simultaneously in 
the assembly process. 

In addition, this assembly requires that the final planets be massive enough to gravitationally perturb and accrete 
all the material in the annulus into one or other of the final planets. If the original disk is not sufficiently massive, the 
final state may naturally consist of additional planets. We can estimate this, at least for the size of disk we simulate, 
by noting that our simulations suggest that A ~ 50 is an upper limit to the degree of separation of final planets. If 
an annulus produces two planets at the inner and outer edges of the annulus, and their mutual A > 50, then it is 
likely that a third, intermediate planet forms from material that is too far from either of the other two planets to be 
perturbed onto a crossing orbit. We can express A as 



73.7- 



y-1 f Mtot 2/^/2 - 1 



y+l\20Ms) 3.472 V0.05AU 



1/2 



-1/3 



(12) 



where r and y arc now running variables representing the inner edge and width of a particular annulus corresponding 
to the 'feeding zone' of a particular final planet. The number 3.472 represents the quantity y^/^ — 1 for the entire 
original disk from 0.05 AU to 1 AU, and Mtot represents the mass of the entire disk, normalised by a host star of mass 
IMq. We have also assumed a = 3/2 for this case. 
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Fig. 7. — The solid points represent Kepler planetary candidate pairs, plotted as period ratio versus the orbital period of the inner 
member of the pair. Coloured points and chains represent successive planets in the same system. The systems shown are all ones with one 
or more confirmed or validated planet - Kepler-11 (red); Kepler-20 (blue); Ke pler- 33 (green) and Kepler-45 (cyan). The curved dotted lines 
indicate curves of constant A, calculated assuming the relation in equation l|12p . The horizontal dashed lines indicate the period ratios 
associated with the 2:1 and 3:2 commensurabilities. 

Fig. 8. — The distribution of true (i.e. not weighted by mass) spacings of neighbouring planets shows several broad peaks. The distribution 
shown here has had a mean trend divided out. The dashed line shows the fitting function we use to characterise this distribution. 

Thus, as an example, assuming that one planet forms at the inner edge of the disk at 0.05 AU, the requirement that 
A < 50 suggests y < 2 under these conditions, so that the second planet lies no further than r = 0.1 AU. Using this 
location as the inner edge of the next annulus suggests y ^ 2.4 and a third planet at 0.24 AU. Similarly, this indicates 
y ^ 3.5 and a fourth planet at 0.85 AU, which is close to the outer edge of our disk. Thus, we expect at least four 
planets to form from our original disk and the simulations do indeed produce mostly four and five planet systems. If 
we repeat this exercise with A ^ 30 (closer to the median of the distribution), the values of y range from 1.5-2, and 
one can fit 6 planets between 0.05 AU and 1 AU, and 7 if the outer edge is 1.2 AU. Thus, stability considerations 
suggest that most systems should have between 4 and 7 planets, as is indeed observed. Note that the range of y 
encompassed by these arguments does allow for the presence of 2:1 period ratios. Even ratios in the 3:2 range are 
dynamically allowed, although they require A values between the median and the stability limit. 

One can convert the quantity y into a period ratio for each planet pair. Figure [7] shows the distribution of period 
ratios as a function of inner planet period for all planets in multiple systems that fall within our sample cuts in § |3l 
Also shown are lines of constant A using equation (jl2p . This is another way of representing the distribution shown in 
FigurelH and we find that indeed the bulk of the planets cluster in the range 20 < A < 40 and that the limits at 10 and 
50 are good delimiters of the edge of the distribution. Note that the observed data are tabulated only using periods, so 
that the uncertainties in the mass-radius relation are not relevant. Figure [7] also shows coloured chains for a handful 
of confirmed, high multiplicity Kepler systems. We see that the outer pair of the Kepler-20 system (blue) has a period 
ratio that puts it in the high A wing of the distribution, suggesting that an additional planet may be waiting to be 
found between Kepler-20f and Kepler-20d. Although the period spacing between Kepler-llf and Kepler-llg deviates 
somewhat from that of the inner planets in the system, it is still well within the expected distribution, so that we do 
not necessarily have a strong motivation to expect an additional planet in that gap. 

2.4. Eccentricities and Inclinations 

Our initial conditions place all the planetary embryos on circular orbits, but the process of assembly generates 
eccentricities by both secular interactions and direct scattering and collision (e.g. Chambers & Wetherill 1998). The 
resulting distribution of planetary eccentricities is given by Figure |9l We see that the tail of large eccentricities is 
somewhat larger than expected for a Rayleigh distribution (the normal assumption for the functional form). A better 
fit is to replace the Gaussian function in the fitting formula with a simple exponential, so that 

/(e) = ;^.-".(l-(l+J-) (13) 
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Fig. 9. — The solid points are eccentricities for all surviving planets interior to 1.1 AU. The dotted line is our exponential Rayleigh 
distribution, with (Te = 0.055, giving a mean eccentricity of 0.11. 

Fig. 10. — We show here the inclinations, relative to the original orbital plane, of all surviving planets in our 100 realisations of the 
model. We see that the range of inclinations at semi-major axis < O.IAU is somewhat larger than those for O.IAU < a < lAU, so we will 
fit to those two distributions seperately. 

with CTe — 0.055. This results in a mean eccentricity of 0.11, broadly consistent with the analysis of Moorhead et al. 
(2011) based on the distribution of Kepler transit durations. 

The underlying eccentricity distribution will also be affected by tidal circularisation for those systems with sufficiently 
small semi-major axis. In the absence of a more detailed model for the dissipation in these low mass planets, we can 
adopt the traditional parameterisation and estimate the circularisation semi-major axis as 

/ T \ 2/13 / 7? \ / A T \ -2/13 / n \ -2/13 

"-^"(sl;) (^) (w;) (m) 

where T* is the age of the host strs, Rp and Mp are the masses of the planets and Q is the quality factor of the 
dissipation in the planet. Thus, values of Q ~ 10-1000 will place Qdrc = 0.1-0.2 AU. 

The inclinations of the planetary orbits are of particular interest in the case of Kepler comparisons, since they will 
determine what fraction of a system's planets actually transit the host star from a given observational direction. The 
ensemble of inclinations (relative to the original orbital plane) is shown in Figure [TOl We see that there is a trend 
for larger inclinations at small semimajor axis, so we model this as two separate inclination distributions, shown in 
Figure fTTl For semi-major axes a < O.IAU, we model the distribution as 



/i(*)ocze-^/4° (15) 



while, for a > O.IAU, we model it as 



oc Ae-5(V0.9°)^ ^ o.35e~^ i'^l + 0.008^e-^(*/iO-3°)'. (16) 

We see that this latter distribution contains a component corresponding to a Rayleigh distribution with a dispersion 
similar to those determined by previous analyses of the Kepler data (e.g. Lissauer et al. 2011b, Fabrycky et al. 2012, 
Tremaine & Dong 2012, Fang & Margot 2012). However, it also contains a higher inclination component which may 
bias the inferences based on an assumed distribution of the Rayleigh form. Similarly, the inclination distribution for 
the closest planets has a larger tail to high inclinations than that of a Rayleigh distribution. 

The inclination distribution describes the orientation of the orbital plane relative to that of the original disk. In 
order to fully describe the orientation relative to a given observer, we also need to specify the angle of the line of nodes. 
Figure [H] shows the offsets in this angle, ft for neighbouring pairs of planets, calculated in the original orbital plane 
of the disk. We see that the orientations of neighbouring orbits are not completely uncorrelated, and that there is a 
tendency to avoid orbits that are exactly aligned with each other. We model this with the simple piecewise distribution 
shown in Figure [12] 



Hansen & Murray 



9 



20 p 
15 - 



10 



a < 0.1 Au 



30 



20 



10 15 



20 




0.1 Au < a < 1 Au 



10 





inclination (degrees) 



50 100 
AQ(degrees) 



150 



Fig. 11. — The points in the upper panel show the distribution of inclinations (relative to the original orbital plane) of all planets inside 
0.1 AU. The points in the lower panel show the same quantity for O.IAU < a < I.IAU. The dashed lines show the models described in the 
text. 

Fig. 12. — The binned data show the difference in the longitude of the ascending node for neighbouring pairs of planets. There is a slight 
avoidance of orbits that axe too close to aligned and a slight preference for Af2 ~ 90°-160° . We model this using the simple distribution 
given by the dashed line. 



2.5. Monte Carlo Model 

Kepler has detected thousands of planets in several hundred systems, which represent a sampling of a potentially much 
larger parameter space. Simulating such a data set directly is unrealistic, but we can use the parameter distributions 
of the previous sections to construct a Monte Carlo model for the generation of planetary systems. This will produce 

model populations with the same properties as the simulations, except for the exclusion of any parameter correlations 
(such as the mass-period trend and nodal distribution) not explicitly identified. 

We draw planetary separations from the distribution function /(e) and planetary masses from /(m) (out to distances 
of 1 AU). We then evaluate the value of A for each pair and reject any systems containing a pair that violates A < 13 
or A > 50 (this is a small minority of cases because the distribution of /(A) is closely related to /(e) and /(m)). We 
also draw distributions of inclination and eccentricity from our functions /(e) and /i(i) or /2(«)- We convert masses 
to radii using the Seager et al. equation of state and the value for R'. 

The sample of true underlying planetary systems is then 'observed' by randomly choosing a particular plane of 
observation and calculating the inclinations of all planetary orbits relative to that plane. We also account for the 
distribution of the planetary lines of nodes with respect to a particular direction chosen for the observer. All planets 
whose impact parameters are < IRq are then counted as transiting. In order to avoid confusion, we will refer to these 
as 'tranets', in the parlance of Tremaine & Dong (2012). 

In order to match with the observations, we must also reject tranets whose radii are too small to be detectable. In 
the interests of clarity, we will restrict our comparison to the parameter range where more detailed studies (Howard 
et al. 2012; Youdin 2011) suggest that the completeness is high. These prior studies suggest that the Kepler database 
is relatively complete for R > 1.5i?0 and P < 200 days, and so we adopt this as our conservative observational limit 
criteria. 



3. COMPARISON TO OBSERVATION 

In order to compare our Monte Carlo model to the observations, we adopt the following criteria (following Howard 
et al. 2012 and Youdin 2011) to define a complete subsample of the publicly released Kepler planet candidate catalog 
as of Jan 2013 (although we also compare to the sample announced in Batalha et al. (2012) in case there have been 
any systematic changes in sample selection with time). Since our simulations are for systems around a IMq star, we 
adopt 4100if < Teff < 6100K and \ogg > 4 in order to approximately match the stellar hosts to the model. We also 
require Kepler magnitudes < 15 to ensure that the completeness is not reduced because of lower photon statistics and 
signal-to- noise. We also then require P < 200 days and 1.5Rq < R < lOi?©. This leaves us with 463 single tranet 
systems, 79 two tranet systems, 26 three tranet systems, 7 four tranet systems, 1 five tranet system and 1 six tranet 
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Fig. 13. — The solid points, with error bars, represent the multiplicity ratios from the Kepler sample as defined in the text. The circles 
are drawn from the Jan 2013 Kepler catalogue, while the triangles are drawn from the previous incarnation (Batalha et al. 2012). Other 
symbols show samples defined by other authors from Kepler data. The open circles are those from the study of Lissauer et al. (2012) 
based on the first major sample of multiple tranets. The crosses are for the sample determined by Tremaine & Dong (2012) and the open 
triangles are those of Fang & Margot (2012) (Figuera et al. 2012 are also consistent, although not plotted because they didn't account for 
five or six tranet systems). The consistency of all such samples indicates that this comparison is relatively robust with regards to sample 
selection. The relative horizontal offsets are arbitrary and for the purposes of clarity. The solid histogram represents the expected ratios 
from our Monte Carlo model with a radius enhancement of _R' = 1.2. The dashed histograms correspond to the same model but with 
_R' = 1.0 (long dashes) and R' = 1.4 (short dashes). We see that the model robustly reproduces the multiple tranet ratios and is not heavily 
sensitive to R' . In all cases the ratio of singles to doubles is dramatically undercounted. The dotted histogram shows the equivalent of the 
solid histogram except that we have assumed that another process produces single tranet systems only, at a rate equivalent to that which 
produces the multi tranet systems. 

Fig. 14. — The solid points are the observed distribution of ^ for the sample defined in the text from the Jan 2013 catalog. The open 
points are defined in the same way but using the earlier sample from Batalha et al. (2012). The dotted histogram represents the distribution 
from our model, with R' = 1.2 and using the full eccentricity and inclination distributions. The dashed histogram represents the model 
result if we keep the same inclination distribution but set all eccentricities to zero. The solid histogram represents an intermediate model 
in which orbits are circularised only for a < 0.25AU. Note that all points outside the vertical dashed lines are excluded from the fit because 
of low number counts. 



system (For the Batalha et al. (2012) catalog, the numbers are 463:85:25:2:2:1). 

3.1. Mutual Inclinations 

One of the simplest measures of comparison between data and observation is the ratios of systems of different 
multiplicity. Figure [T3l shows the comparison of observations with the statistics of our Monte Carlo model for different 
values of the radius enhancement i?', as well as equivalent measures from several other studies in the literature. We 
see that the models can reproduce the ratios of triples to doubles and of triples to systems with four or more tranets. 
We also see that this consistency is robust to uncertainties in R' , as the predicted ratios don't change much over a 
range of R' consistent with the known planetary mass-radius relation. There is also a consistency of the observed 
ratios despite different sample selections by different authors (Lissauer et al. 2011b; Tremaine & Dong 2012; Fang & 
Margot 2012), suggesting that the comparison is also robust with respect to the particular observational cuts. 

However, our models fail to reproduce the observed ratio of multiples to single tranets, in the sense that the Monte 
Carlo model severely underpredicts the number of systems which show a single transiting system. Lissauer et al. 
(2011b) noted the potential excess of singles and argued that simple geometric considerations suggest that the ratio 
of singles to doubles should be higher unless there is an unusual correlation or inclination dispersion. To that end, 
we paid special attention to modelling the high inclination portion of the distributions in Figure [TT] and to potential 
correlations in the lines of nodes, but neither of these appears to have materially affected the results. Although other 
studies such as Tremaine & Dong (2012) and Fang & Margot (2012) claim consistency in their modelling of all the 
data, this is because their chosen statistical distributions allow for a substantial population of single transiting systems, 
something not allowed by our more physically derived model. 

We can, of course, simply postulate a second unknown process at work that produces only single tranet systems, 
which can then be combined with our model. If we postulate that this second process operates at the same rate as 
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Fig. 15. — Solid points represent the period ratio distribution of our observational sample. The solid histogram is the corresponding 
distribution from the Mo nte C arlo model with R' = 1.2. The dotted histogram is for the same parameters but drawn from the model with 
an adjusted inner edge I3.3| l. 

Fig. 16. — The solid points are the same as for Figure [TSl but the solid histogram now represents only the period ratios in systems where 
four or more tranets are detected. The dotted histogram represents the period ratios of systems in which two tranets are detected. The 
distribution of high multiplicity systems better shows the two peaked structure evident in the observations, albeit shifted slightly to longer 
orbital periods. 

our model, we predict a ratio of singles to doubles ~ 0.2, which matches the observations. 

A second, more accurate measure of the mutual inclination distribution is the velocity-normalised transit duration 
ratio 



1/3 



Tdur,/-P, 



1/3 



(17) 



where Tdur is the transit duration and P is the orbital period (Fabrycky et al. 2012). This is an empirical measure of 
the ratio of the impact parameters for two neighbouring tranets, which should scale as the semi-major axis ratio for 
perfectly aligned systems. Figure [Til shows the distribution of ^ that results from our Monte Carlo model, compared 
to that of the observed sample as defined above. The dotted histogram is for R' — 1.2, using the eccentricities and 
inclinations drawn from the simulated systems. We see that the model distribution captures the wings correctly, but 
underpredicts the peak near ^ ~ 1. This behaviour is the same for all values of R' . The dashed histogram shows the 
model distribution if we use the same R' = 1.2 and full inclination distribution, but circularise all the orbits. We 
see that this model captures the observations very well, with a P^r degree of freedom of 1.15. Indeed, reasonable 
agreement can be obtained by only circularising orbits within a certain upper semi-major cutoff. The solid histogram 
shows the ^ distribution for the case in which this limit is 0.20 AU, which defines the 2a level with respect to the 
fit obtained with full circularisation. We note that the improvement in the fit is solely due to the reduction in the 
eccentricity - we do not assume any change in the mutual inclinations as a result of tidal effects. 

3.2. Period Distributions 

The comparison of the model period ratio distribution to observations, shown in Figure[T5l is not quite as good as the 
various measures that depend on inclinations. Some of the structure is reproduced, but the simulations underpredict 
the number of close pairs relative to the number of more widely separated pairs. Furthermore, although there is a 
double peaked structure to the distribution between period ratios of 1-3, the peaks are shifted to larger values than 
observed. Indeed, the entire shape of the distribution appears shifted by ~ 5% to larger values. This suggests that 
some level of dissipation is required beyond the simple assembly model assumed here. It has been suggested that 
such enhancements near commensur abilities, and the slight shift in the observed values, could be the result of tidal 
evolution (Wu & Lithwick 2012b; Batygin & Morbidelli 2013). We have not implemented this effect, although we have 
included the semi-major axis shift associated with circularisation for values less than 0.20 AU. 

Some of this mismatch may also be due to poorly quantified selection effects. Steffen (2013) has noted that the period 
ratio distribution is subtly different between high multiplicity (four or more tranet candidates) and low multiplicity 
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(two tranet candidates) systems. In Figure we show the corresponding comparison between the same data as in 
Figure [TSl but now with two distributions drawn from the model in the same spirit as Steffen (2013). We see that 
the high muhiphcity model systems actuahy do a much better job of reproducing the data than the lower multiphcity 
systems. The mismatch in Figure [15] is clearly the resuh of the stronger weighting towards the two tranet systems as 
in Figure [T31 but the structure here suggests that selection effects that favour more high muhiphcity systems could 
potentially improve the comparison as much as moderate radial migration. We note also that the distribution from 
the two-tranet systems is indeed shifted slightly compared to the higher multiplicity systems, as suggested by the 
comparison by Steffen. 

We can also compare our simulations to the distribution of absolute periods of detected tranets. This is shown in 
Figure flTl We see that the shape of the distribution is well reproduced for P >10 days, but that the simple model 
overproduces short period tranets by a factor of three near the peak of the distribution. This is likely a consequence 
of our rather naive application of a single model realisation to the full population, in the sense that our planetary 
embryo disks all have an inner edge at 0.05 AU and therefore almost always produce a planet with an orbital period 
< 10 days. If the true underlying model for the planetary initial conditions contains some variation in the location of 
the inner edge, then this peak will be reduced. 

3.3. Model Embellishments 

We have kept our default model simple in order to highlight the robustness of the match to the data, but we can 
also demonstrate that a more comprehensive match to the data is achievable with minor modifications to the model. 

One mismatch observed in the previous subsection is the overproduction of tranets with orbital periods < 10 days. 
This is due in large part to the assumption that all protoplanetary disk have the same inner edge. To demonstrate 
this, we consider a modification to our Monte Carlo model in which we adopt an inner edge to the original disk of 
0.08 AU, in 55% of cases. In the other 45% of cases, we adopt the original model. We use the same mass, orbital 
spacing, eccentricity and inclination distributions as before. In effect, we simply remove those planets that lie inside 
0.08 AU in 55% of the cases. We show the results of this model in Figure [T71 This modification does not change the 
quality of the fits to the ^ distribution or the period ratio distribution. 

Similarly, we have not attempted to directly match the observed distribution of radii, because of the distinct possi- 
bility that there may be significant stochastic variation in the mass of hydrogen accreted/retained by planets. Never- 
theless, one can match the observed radius distribution with a suitable choice of mass-radius relation. Figure fT8l shows 
the observed distribution of tranetary radii for the sample identified in § [3l compared to two distributions derived 
from our Monte Carlo model. The dotted histogram shows the results for R' — 1.2, which captures the median value 
of the population but does not match the shape. The solid histogram shows the result from the same model except 
that we now adopt R' — 1 + 0.05{M/Mq). This provides a good fit to the tail of the distribution at larger radii, and 
still matches the other observational constraints. This relation bears some resemblance to both the interpolation of 
Lll and the linear fit of Wu & Lithwick (2012a), although the nature of the fit means it is weighted more by lower 
mass objects than the higher masses as in the case of Wu & Lithwick. The effective physical model described by this 
relation is one in which planets have Hydrogen mass fractions < 1% for masses ~ Af^, increasing gradually to roughly 
50% at masses ~ 3OM0. 

4. DISCUSSION 

Our study follows on the heels of several previous analyses of the Kepler multiple tranet candidate systems (Lissauer 
et al. 2011b; Fabrycky et al. 2012). All of these studies agree that the underlying planetary configuration is best 
modelled by a population that has an inclination dispersion of only a few degrees, and that this is also consistent 
with the radial velocity sample (Lissauer et al. 2011b; Fabrycky et al. 2012; Figueira et al. 2011; Tremaine & Dong 
2012; Fang & Margot 2012). Our analysis affirms these conclusions with the additional benefit of being drawn directly 
from a physical model, rather than a set of ad hoc postulates. Our inability to match the single tranet numbers is 
also foreshadowed either directly (Lll) or by best fit models that require a large fraction of low multiplicity systems 
(Tremaine & Dong 2012; Fang & Margot 2012). 

The fact that our underlying model always produces three or more final planets brings this issue into stark contrast. 
The most convenient explanation is that that the rate of false positives is higher amongst the single tranet sample than 
the multiples, but this would require that current estimates of the false positive rate are severely in error. Lissauer et 
al. (2012) have argued that the rate of false positives amongst the multiple systems is very low, but this necessarily 
excludes the single candidate systems. Santerne et al. (2012) has claimed that the false positive rate for Jupiter-size 
candidates is higher (~ 35%) than originally estimated (Morton & Johnson 2011), but it is not clear that the same 
false positive rate will apply to systems with smaller eclipses. Detailed simulations of the Kepler detection process 
suggests a false positive rate in the range 7%-16% for the size of planets in our sample (Fressin et al. 2013). Thus, it 
seems that we require either an entirely separate contribution of single tranet systems or a process that modifies or 
edits some fraction of the systems that assemble according to our model. 

Although we have neglected any effects due to large-scale, post-assembly planetary migration in this calculation, 
it could potentially be the alternative contributing process we seek, especially if the late-time instability of resonant 
chains (Terquem & Papaloizou 2007; Ida & Lin 2010) produces primarily systems of low multiplicity. However, in that 
event, it is surprising that both processes produce essentially the same orbital period distribution. Figure [T9l shows the 
cumulative distribution for tranets (1.5i?0 < R < lOi?®) in single transit systems (open circles) and multiple transit 
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Fig. 17. — Solid points are the data, for all Kepler tranets, in both single and multiple systems, for the sample defined in the text. The 
solid histogram is for a model wit h R' = 1.2 , while the dotted histogram is the same, but in the case where we have adjusted the inner 
edge of the distribution (see S I3.3|l . 

Fig. 18. — The points show the distribution of radii for the planetary candidates in the observed sample, while the dashed line indicates 
the cutoff radius. The dotted histogram is the distribution that emerges from our default model with R' = 1.2. It reproduces the average 
value of the observed distribution but clearly undercounts both the low and high radius tails. The solid histogram shows the result if we 
use the mass dependant radius correction i?' = 1 + 0.05(M/Mgj), which does a better job of reproducing the observed shape. 

systems (filled circles). There is little evidence of any difference between the two populations that might indicate 
a different origin scenario for ^ 50% of the single systems. Furthermore, there is also little evidence of any relative 
incompleteness based on host star magnitude. Figure shows the ratio of double to single tranet systems as a function 
of the Kepler magnitude of the host stars. We see that the ratio remains essentially constant from Kepler magnitudes 
12-16, well beyond our nominal completeness limit. 

Alternatively, this could simply indicate that our model is incomplete in the sense that the systems are assembled 
as we describe, but are then subjected to further dynamical influences that remove some planets from the system 
and reduce the multiplicity. Indeed, we have simulated systems of terrestrial class planets only, without consideration 
of the interaction with potential gas giant planets in the same system, despite the fact that we know such planets 
exist in our own solar system and others. It is well known that even limited migration by giant planets during the 
loss of the protoplanetary nebula can cause the frequencies of secular interactions to evolve, potentially resulting in 
resonant interactions, eccentricity excitation and eventual instability in lower mass planetary components (Ward 1981; 
Nagasawa, Lin & Ida 2003; Levison & Agnor 2003; Thommes, Nagasawa & Lin 2008). Alternatively, giant planet 
secular evolution by gravitational interactions alone may be enough to stimulate dynamical evolution of lower mass 
systems (e.g. Veras & Armitage 2003; Laskar 2008; Lithwick & Wu 2011; Matsumura, Ida & Nagasawa 2012) Thus, 
it is possible that some of this dichotomy may simply reflect the role of a more distant component of giant planets in 
some systems but not in others. 

4.1. W(h)ither Migration? 

We have demonstrated that a simple in situ assembly model can reproduce the broad distributions of tranetary 
periods and inclinations observed in the Kepler database. However, this model seems to underpredict the frequency 
of close pairs (i.e. pairs with period ratios < 2.5) nor does it completely reproduce the observed structure near the 
3:2 and 2:1 commensurabilities. These suggest that, while the broad organisation of the observed planetary systems 
appears to be consistent with a primarily gravitational process, matching the fine structure will likely require a model 
that includes some means of additional dissipation. 

Part of this may be provided by tidal dissipation in the planets. We have already demonstrated that the fit to 
the ^ distribution can be improved if we allow for the circularisation of orbits inside ~ 0.20AU (corresponding to 
Q ~ 10-50, depending on planetary mass and radius). This process is promising in that it can reduce the eccentricity 
while preserving the inclination distribution of the planetary orbits, which seems to fit the observations well. 

It is somewhat surprising that there is so little need for migration in fitting these observations. The rapid movement 
of planetary mass objects in a gas disk has been a topic of great interest for the last several years. The strength of 
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Fig. 19. — The solid points are the cumulative distribution with period for Kepler tranets in multiple systems. The open points are the 
same quantity for single Kepler tranets. There is no evidence that the two tranet samples are drawn from different distributions. 

Fig. 20. — The solid points show the ratio of double to single tranet systems, if we restrict our sample to host stars brighter than Kepler 
magnitude Km- Error bars are dictated by Poisson errors on the double tranet systems. The vertical dashed line is the limit wc use to 
define our sample in § [S] and the shaded region shows the range of values we expect, based on our simulations. If some double tranet 
systems were being identified as only single tranet systems because of low signal-to-noise, we would expect this ratio to decrease. 

this interaction was expected to result in very few planets of the observed mass range at the orbital periods observed, 
(e.g. Ida & Lin 2008). One way to slow the rate of migration is for strong gravitational interactions between planets 
to counteract the transfer of angular momentum from orbit to gaseous disk, but this is expected to lead to systems 
with a large fraction of the systems in or close to resonance (Terquem & Papaloizou 2007; Ida & Lin 2010). Although 
some resonant systems are seen, it appears as though a migration model requires a significant additional stochastic 
forcing to match the observed period distribution (Rein 2012). 

It is possible that this mismatch simply reflects the limitations of current models for how planets interact with the 
gaseous disk, as the effects of migration depend critically on the interaction between the planetary forcing and the 
structure of the disk (e.g. Terquem 2003; Laughlin, Steinacker & Adams 2004; Johnson, Goodman & Menou 2006; 
Paardekooper & Mellema 2006; Paardekooper et al. 2010; Bromley & Kenyon 2011; Hasegawa & Pudritz 2011; Kretke 
& Lin 2012). This is because the speed and even direction of migration depends on the imperfect cancellation between 
the torques exerted by the disk interior and exterior to the planet. One possibility, suggested by several authors (e.g. 
Masset et al. 2006), is that the torque may reverse sign at particular locations in the disk, leading to 'planet traps' 
- preferred locations where planets assemble. It is possible that our power law surface density profile may simply 
represent an averaged version of a disk with several such preferred locations, and that some disks may have more 
localised distributions which could contribute to the overabundance of low multiplicity systems. Indeed, this might 
help to place our Solar System in the Kepler context, as our own terrestrial planet system is potentially the outcome 
of a disk with a single localised planet trap (Hansen 2009). 

4.2. Predictions and Speculations 

The primary goal of this study is to demonstrate that a simple model of in situ assembly of rocky planets can repro- 
duce the distribution of observed parameters in the Kepler multiple tranet sample. However, given that consistency, 
it is useful to attempt to outline additional analyses or observations that can further refine the comparison. 

Some of the consequences are rather obvious. The fact that neighbouring tranets with large period ratios can harbour 
additional unseen planets has been noted before (e.g. Lll), but we can provide a prediction of the quantitative threshold 
for such predictions. We have seen that our simulations rarely produce neighbouring systems with A > 50, suggesting 
that multiple systems with larger values should harbor planets in these apparent gaps. Our model suggests planetary 
systems are factors of 2-3 less densely packed than a 'maximally packed' system just on the edge of stability, so that 
the threshold of A ~ 50 should mark a change in the frequency of non-transiting companions. Indeed, Figure [^l] 
shows that the distribution of A that emerges from the simulations shows a qualitative break at A 50. The high 
A tail of this distribution is thus the group that should contain non-transiting interlopers. We can compare this to 
the observed distribution of A in the observed sample. In this case we use the catalog of Batalha et al. (2012) in 
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order to use later additions to the catalog to assess any predictive power our model comparison may uncover. A 
comparison with the data also requires the adoption of a mass-radius relation. Using the mass-radius relation used by 
Lll and F12, to maintain consistency with previous analyses, we define our observed A as A^. The resulting observed 
distribution shows a very similar feature, but the break occurs at A^ 30. The similarity in shape between the two 
distributions suggests a measure of renormalisation is in order, and that the values of A/, used in Lll and F12 are 
likely underestimated by ^ 60%, or that the Lll mass-radius relation overestimates the planetary mass, at least in 
the lower mass regime. Table [2] lists the observed pairs that have Al > 30 in Figure \T[\ If we estimate masses from 
orbital periods for the entries in this table with 30 < A/, < 33, using equation ([3]), we find a range 38.4 < A < 50.5 
for 11 systems, with a median of A = 48. Thus, it suggests that the renormalisation from A^ = 30 to A = 50 is 
consistent with the difference in the adopted models for the planetary masses. 

As can be seen from the notes to Table [21 there are several systems in which we have strong evidence for the 
presence of planets in the ostensible gaps. The systems plotted in Figure [2T] were selected for consistency with our 
model comparison, namely we restricted our attention to planets with radii > 1.5i?^. In two of the cases, additional 
tranets (KOI 70.05 and KOI 117.04) were already known but not included in the above comparison because their radii 
fell below the radius cutoff. The addition of further data beyond the first six quarters has also resulted in the detection 
of tranets in the gaps in three other (KOI-116, KOI-593 and KOI-671) systems. In three systems (KOI-316, KOI-341 
and KOI-582) additional observations have cast doubt on either the validity of one of the candidates themselves, or 
on one of the periods (which would therefore change the A^). 

In addition to directly observing a transit by an additional planet candidate, it is possible to identify the presence of 
non-transiting companions by observing variations in the timing of transits by known tranets, due to the gravitational 
perturbation of the unseen companion (Agol et al. 2005; Holman & Murray 2005; Veras, Ford & Payne 2011). Ford 
et al. (2012) identified 1436 systems with potential transit timing variations (TTVs). If we restrict our attention 
to systems with two candidate tranets, with the same stellar, planetary and magnitude cuts as before, we are left 
with 11 potential systems with median deviations from a linear ephemeris that are larger than the quoted dispersions 
representing transit timing accuracy. Figure [22] shows the cumulative distribution of these 11 systems as a function of 
Al, along with the same distribution for the full sample of observed double candidate systems (subject to the same 
cuts) . We see that the systems with potential TTVs are indeed skewed to larger A ^ , and that the maximum deviation 
occurs at Ai ~ 30, as expected. The small number of candidate systems means that this result is still statistically 
weak (A Kolmogorov-Smirnov test suggests that the TTV-selected distribution could be drawn from the larger sample 
with a probability of 7%), but this should hopefully improve with longer baselines of observation. Furthermore, Steffen 
et al. (2012) identify candidate pairs which may show anticorrelated TTV signatures, and many of the pairs in Table[2] 
are identified as worthy of further study. A better comparison may be possible soon as Mazeh et al. (2013) promise 
an improved catalog will be shortly forthcoming, based on a longer baseline of data. 

It is also worth noting that a first prediction of this model, based on multiplicity statistics, was made in HMll. In 
that case, three multiple planet systems were identified in which the detection of additional planets would improve 
the match with the models. The recent announcement of additional planets in one of these systems, IID40307 (Tuomi 
et al. 2012), does indeed improve the agreement with the original models. Although this is hardly a revolutionary 
prediction (a linear trend in the radial velocity data was already known at the time of writing) , it is still encouraging 
that additional detections improve the agreement between the model and the data. 

If the overabundance of single tranet systems is indeed the result of a systemic instability induced by the presence of a 
giant planet system at larger radii, then it would of interest to study two samples (one set of multiple candidate systems 
and another of single candidates of similar radius) of Kepler systems with radial velocities, in order to characterise 
whether there are any significant differences which may betray divergent dynamical histories. One possible signature 
would be if the single short-period candidate systems showed two or more giant planets, while multiple short-period 
candidate systems showed only zero or one giant planet. Such a difference might point the way to secular evolution 
and resonance playing a role in the disruption of inner planetary systems. 

Improved eccentricity constraints would also enable better constraints on the model. As showed above, the fit to 
the ^ distribution (Figure [14]) got better for larger Qdrc (i-e. stronger tidal damping in the planets). The current 
constraints on the eccentricities due to transit durations (Moorhead et al. 2011) find little evidence for circularisation. 
However, their contraint is weakened by uncertainty in the stellar parameters and strictly only applies for cooler stars 
than our nominal host. Wu & Lithwick (2012a) find that a significant fraction of their systems with strong TTVs 
show very small free eccentricities, consistent with a level of tidal dissipation similar to what we have assumed here 
(although there appears to be a minority of systems with significant eccentricities, which are harder to explain). 

5. CONCLUSIONS 

We have studied the assembly of planetary embryos into final planets for a disk of total mass 20M(^ (interior to 1 
AU) around a solar mass star. We quantify the distribution of final masses, eccentricities and inclinations, as well 
as the spacings of the final planets, and use these to generate a large population of simulated systems in a Monte 
Carlo model. The output from these simulations is used to quantify their detectability in a transit survey such as that 
conducted by the Kepler mission. 

Our principal conclusions are 

• The relative frequency of multiple transiting systems is well matched by the simulations. However, our model 
underpredicts the frequency of single tranet systems. If this is not due to unquantified selection effects, then it 
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Fig. 21. — The filled circles are the distribution of A values for the observational sample as defined in the text, for the Batalha et al. 
(2012) catalog. The values of A in each case were calculated using the mass-radius relation of Lll, and then divided by a factor of 0.64. 
This was to bring the observed break into line with that observed at A ~ 50 in the output from our Monte Carlo model (solid histogram). 
Systems with values larger than this are very likely to harbour an undetected planet between the observed pair. 

Fig. 22. — The solid histogram shows the cumulative distribution in A^ of the 11 candidate pair systems which pass our stellar and 
planetary cuts and show TTVs of more than Icr statistical significance. The dotted histogram shows the same but for the full sample of 
tranet pairs, selected in the same way except without taking account of TTV signatures. 

suggests that either an independent process produces mostly low multiplicity systems, or that some fraction of 
the planetary systems formed by our model undergo additional perturbations which reduce their multiplicity. 

• The distribution of mutual inclinations, quantified by the velocity- normalised transit duration ratio ^ (F12), is 
well matched by the simulations, especially if we allow for tidal damping of eccentricity for short period orbits. 
This is encouraging as it provides a physical basis for the observed inclination distribution of the Kepler tranet 
candidate sample. 

• Our default model matches the distribution of periods well for periods > 10 days, but overpredicts the number 
of short period tranets. This deficit can be corrected by allowing for a variation in the inner edge of the initial 
disk, as demonstrated in § 13.31 

• Our model underpredicts the number of close tranet pairs. This is the one observable that suggests that some 
extra (non-tidal) dissipation is required during the planetary assembly. At present there appear to be at least two 
possible ways to achieve this. One would be to cause a moderate amount of radial migration to bring pairs into 
closer alignment. Alternatively, it appears as though the observed period ratio distribution would be matched 
better if the relative fraction of high multiplicity to low multiplicity systems were increased in the Monte Carlo 
model. This could potentially be achieved with a moderate amount of inclination damping by interactions with 
a protoplanetary gas disk. However, too much damping would destroy the agreement with the ^ distribution, so 
it might be possible to place some interesting constraints on the conditions during planet assembly. 

• We suggest that most observed tranet pairs with > 30 contain additional non-transiting planets between 
them, and we find weak statistical evidence (limited by currently small sample sizes) that pairs with non-negligible 
TTV signatures are preferentially found with values above this threshold. Larger and improved data sets can 
help to refine this test of the model. 

Our assumption of a universal power law disk of fixed mass, which assembles by purely gravitational interactions, 
is clearly an oversimplification of what is expected to be a very complex physical problem. However, the broad 
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distributions of inclination, multiplicity and planetary spacing match well to both observations directly and to prior 
inferences based on ad hoc models. The fact that our model has a physical basis places some of the issues into starker 
contrast. In particular, the observed excess of single (or low multiplicity) tranetary systems suggests that additional 
processes are needed to provide a complete model. The fact that a purely gravitational process produces a final 
distribution that matches the observations places constraints on the amount of dissipation expected from interactions 
with the protoplanetary disk. Some level of dissipation is required to match the period ratio distribution, but it needs 
to be weak enough to destroy neither the broad spacing in semi-major axis nor the inclination distribution. 

Finally, our ability to fruitfully perform such analyses illustrates the value of large scale, homogeneously selected 
observational datasets. The requirement that our models match a broad range of observed systems promises to finally 
allow us to begin to place meaningful constraints on the mechanisms by which planetary systems form and assemble. 

N.M. is supported in part by NSERC of Canada and by the Canada Research Chair program. B.H. thanks the 
group at NASA Ames for discussions that stimulated the discussion in § 12.3.11 and both express their enthusiasm for 
the work done by the Kepler team in producing this wonderful data set. 
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Final States of Assembly Simulations 
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Candidates for Missing Planet Systems. 
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^ KOI 70.05 is a known candidate from Q1-Q6 data, lies in this gap but was not included in the analysis 
because Kp = l.O3i?0. 

^ Inclusion of Q1-Q8 data results in the discovery of KOI 116.04 in this gap. 

^ KOI 117.04 is a known candidate from Q1-Q6 data and lies in this gap, but was not included in the 
analysis because i?p = l.O7i?0. 

* Reported as having a significant TTV signature in Mazeh et al. 2013 
^ Reported as having a significant TTV signature in Mazeh et al. 2013 

® Inclusion of Ql Q8 data appears to render the KOI 316.01 period less certain. 

Mazeh ct al. (2013) report a significant TTV signature for 341.01 and also suggest that the real period may 
be half that reported. This would reduce the spacing and move the system below the Aj, = 30 threshold. 

* Candidate KOI 377.01 shows a strong TTV signal in the Ford et al. (2012) compilation, but this is most 
likely due to the proximity of the 377.01-377.02 pair to the 2:1 commensurability. 

^ KOI 448.02 shows a possible TTV signature in the Ford et al. (2012) compilation, and is also reported as 
having a significant TTV signature by Mazeh et al. (2013) 

Reported as having a significant TTV signature in Mazeh et al. 2013 

Inclusion of Ql— Q8 data adds a third, outer, candidate to the system, also with a large separation. This 
system could potentially hold two extra planets. 

Further analysis suggests that KOI 582.01 is a false positive. Discovery of a new candidate 582.03 results 
in a pair with A/^ = 21. 

KOI 593.02 shows a potential TTV signature in the Ford et al. (2012) compilation. Inclusion of Q1-Q8 
data results in an additional candidate, KOI 582.03, in the gap. 

Inclusion of Q1-Q8 data actually reveals two new tranets in this gap, while also reducing the radii of all 
four tranets below 1.5ii0. 
This pair was noted to have a possible anticorrelated TTV signature in Steffen et al. (2012). (S > 1) 



